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Abstract 



' We present results of molecular simulations of a model protein whose hydrophobic collapse 

o : 

O . proceeds as a cascade of downhill transitions between distinct intermediate states. Different inter- 

mediates are stabilized by means of appropriate harmonic constraints, allowing explicit calculation 

> 

^T) ' of the equilibrium free energy landscape. Nonequilibrium collapse trajectories are simulated inde- 

■ pendently and compared to diffusion on the calculated free energy surface. We find that collapse 

(N 

'. generally adheres to this surface, but quantitative agreement is complicated by nonequilibrium 

O 

' effects and by dependence of the diffusion coefficient on position on the surface. 
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In Kramers theory a chemical reaction is modeled by Brownian motion of a particle in 
an effective field of force The force is related to the change in the free energy of the 
system as it moves along a reaction coordinate that parameterizes the physical or chemical 
transformation involved. The rate of the reaction is then determined by the shape of the 
(free) energy landscape and by an effective diffusion constant. In recent research the energy 
landscape has been frequently invoked in discussions of complex biological phenomena, such 
as folding of proteins and RNA IJ]. Furthermore, the energy landscape has been 

measured experimentally for some biophysical systems [sl, where it is assumed that a single 
reaction coordinate captures a process which occurs in a high dimensional phase space. In the 
case of two-state protein folding, theoretical studies have indicated that effective diffusion 
along a reaction coordinate is described by the value of the diffusion coefficient which is 
not constant, as in Kramers theory, but coordinate-dependent 

aa. 

Such dependence of 

diffusion on the position on the free energy surface will be even more important for downhill 
protein folding which occurs in the absence of a free energy barrier and is, therefore, a 
diffusion-limited process. 

In this letter we study thermodynamics and kinetics of polymer hydrophobic collapse 
as a basic model for downhill protein folding. We find that effective diffusion slows down 
by an order of magnitude when the polymer enters the free energy basin of its collapsed 
(ground) state, so that subsequent diffusion across this narrow region of the phase space 
takes an appreciable fraction of the overall collapse time. In the absence of a free energy 
barrier, the application of the diffusive model to protein folding requires an accurate knowl- 
edge of the entire ground state basin, which cannot be deduced from irreversible folding 
trajectories alone. Below we describe an efficient and accurate method for the calculation 
of multi-dimensional free energy surfaces, including high gradient slopes of the ground state 
basin. The method is based on subjecting a polymer to a set of harmonic constraints and 
thereby preventing its collapse to the ground state. Comparing the results of simulations 
of an unconstrained polymer and of the polymer harmonically bound to various positions 
on the free energy surface, we conclude that the freely collapsing polymer fails to achieve 
local equilibrium as it enters the ground state basin. Furthermore, for the specific polymer 
considered here, a local nonequlibrium effect is observed at the entrance to the basin, which 
is caused by a relatively intense collision of different parts of the polymer. Such nonequilib- 
rium effects are beyond the scope of the Kramers model and its use may introduce artifacts 
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FIG. 1: (a) Polymer conformations corresponding to different states. H- monomers are grey (dark), 
P-monomers are green (light). Numbers in brackets indicate monomers comprising compact glob- 
ules, (b) Free energies F and F* for the impeding force /o = (solid) and /o = 0.02e/(7 (dash), 
0.04e/cr (dash-dot). F, F* are given in the units of k^T and -Rg in the units of a. The dashed- 
dotted line is shifted up by AOkBT. 

in the numerical analysis of experimentally observed folding times. It remains an open ques- 
tion how the Kramers theory can be modified to account for both local and more general 
nonequilibrium effects. However, in computational studies as described here, combined sim- 
ulations of irreversible folding trajectories and of harmonically constrained macro molecules 
can provide detailed information on their equilibrium free energies as well as indicate various 
nonequilibrium effects. 

We consider a simple HP model of a protein, consisting of a polymer chain of "hydropho- 
bic" (H) and "hydrophilic" (P) spherical monomers. HH interactions are modeled by the 
attractive Lennard- Jones potential, ULj(r) = 4:e[{a/ry^ — (cr/r)^], where e = bk-oT/?) and 
a is the monomer diameter. HP and PP interactions are modeled by the repulsive part of 
the same potential. The fraction of P-residues in a sequence is chosen to be equal to the 
fraction of monomers on the surface of a compact spherical globule. This criterion yields 
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44 P-residues for a chain of length 100 8j. The data presented below are for the particular 
sequence HPHHHPPPPHPHPHHHPPHHPPHHPPHHPHHHPHHHPPPHPHPHPPPPH- 
PHPPPHHHPHPPHHHHHHHHPHPPHPPHPHHPHHPPHHPHHHHHPHHPPHH, that 
has been generated randomly to satisfy this condition [3]. The polymer bond l eng th a 



and the angle between adjacent bonds are constrained by means of the FENE [10|] and 
harmonic potentials, respectively: u-p{a)/k-BT = — 33.75 ln[l — (2a/3a)^] and Uy{{(1)) / k-oT = 
50(0 — 2n/3)^. The polymer dynamics are simulated by solving the Langevin equation 
for individual monomers i, mvi = —^Vi + Fi + Ri, where m is the monomer mass, 7 
is the drag coefficient, Fi is the conservative force, and Ri is the Gaussian random force, 
{Ri(t)Rj(t')) = G'-fk^TSit—t'). The Langevin equation is integrated using the leap-frog algo- 



rithm with the time step At = O.OOlr and the drag coefficient 7 = mT~^, where r = a^fmfe 
is the unit of time. 

The typical collapse pathway of the HP polymer proceeds through states 3, 2, 1, 0, 
shown in Fig. [TJ^a). To obtain accurate free energies of these states, we have run a series 
of simulations with the impeding force fi = fo{ri — rcm)/RG applied to each monomer z, 
where is the monomer's position and r^m is the polymer's center of mass. It can be shown 
that the total work done by this force field as the polymer moves from one configuration 
to another depends only on the difference between the radii of gyration Rq in the two 
configurations. This leads to the following relationship between the free energies F{Rq) and 
F*{Rq, /o) of the free and force-biased polymers, 

dFiRc) dF*{RG,fo) 



dRc ORg 



+ foN 



= ~^bT ^ h foN, [1) 

oriG 

where M{Rg, /o) is the number of times the radius of gyration is found in the range Rg ± 
ARg/2 for specific /o, and is the degree of polymerization. 

In simulations with different /o, successive intermediates in the collapse pathway have 
been stabilized and extensively sampled to yield F*{Rg, fo). For example, at /o = 0.04£:/cr, 
state 2 becomes an equilibrium state and 3 is a long-lived metastable state, such that the 
irreversible transition from 3 to 2 occurs after a wait time of ~ lO^r. At /q = 0.02£:/o", the 
irreversible transition from state 1 to occurs at a similar time scale of lO^r and is preceded 
by frequent, once per ~ lO^r, reversible transitions between states 2 and 1. 

The unbiased free energy F[Rg) is obtained by averaging the last line in Eq. ([1]) over 



different /o with the wei ghts w(Rg, fo) = M{Rg, fo)/ ^j:^ M{Rq, fo) and integrating the 
result with respect to -Rg [Ul- F{Rg) is indicated by the sohd curve in Fig. [T](b); the dashed 



curve in the same figure is the free energy for /q = 0.02e/a. Note that at /q = 0.02e/a the 
barrier height for the "unfolding" transition from state to 1 appears to be ~ T.Tk^T which 
should be traversed within time scales of the present simulations. In fact this transition is 
not observed to occur, which is a clear indication that the depth of the ground state basin 
has been underestimated. 

The spherical globule in Fig. [T]^a) can be forced open by applying a higher value of the 
pulling force /o = 0.04e/a, for which state 2 is an equilibrium state. During this process, 
however, a different intermediate 1*, shown in Fig. [U^a), is observed in place of 1. When 
irreversible folding transitions are observed (i.e., the reverse transition is observed rarely 
or not at all in simulation) the apparent free energy drop associated with the irreversible 
transition will depend on the actual time that the molecule spends in its intermediate or 
final state before it proceeds further with its collapse/folding or before the simulation run 
is terminated. The assumption that free energy surfaces can be accurately estimated from 
irreversible folding trajectories, common in simulations of protein folding, leads to systematic 
underestimation of the depth of the free energy basin of the native state. For an accurate 
assessment of a deep free energy minimum it is necessary to develop a method that uniformly 
samples large gradient slopes leading down to this minimum, which is not achieved by 
irreversible folding trajectories. 

To address this issue, we have completed a new analysis of free energies with two reaction 
coordinates that distinguish between states 1 and 1*. A choice of reaction coordinates which 
resolves all on-path and off-path metastable states consists of the radius of gyration Rqi of 
the subchain of monomers 1-36 and the distance -Rem between the centers of mass of sub- 
chains 55-100 and 1-12. To systematically investigate the conformational space associated 
with states 0, 1, 1* and 2, we have performed numerous simulations in which i?cm and Rqi are 
bound to various centers R'^^ and Rq^ by harmonic potentials Mhi(-Rcm) = 0.5A;i(i?cm — -Rcm)^ 
and Mh2(-RGi) = 0-5k2{RGi — -Rgi)^' where k\ = k2 = ^.'bejo^ . In contrast to the constant 
force bias employed in Eq. ([1]), these harmonic potentials are sufficiently stiff to stabilize 
polymer trajectories on the steep slopes leading down to a deep free energy minimum, 
which allows an accurate measurement of the depth of this minimum and of the transition 
area. The i?cm and Rg\ derivatives of the unbiased free energy F{Rcm, Rqi) in the vicin- 
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FIG. 2: (a) Free energy surface for polymer collapse from state 2. Two clearly visible trenches 
belong to two distinct pathways connecting states 2 and 0. (b) Contour plot of the same free energy 
surface together with three typical collapse trajectories from state 2. Rem and Rqi are introduced 
in the main text and shown in the units of a. 

ity of each center {Rem, Rqi) readily calculated from similar derivatives of the biased 
free energy F*{Rcm, Rgi, ki, R^my ^2, Rqi) by subtracting from the latter ki{Rcm — Rem) 
k2{RGi — Rgi)^ respectively. To establish the partial derivatives over the entire phase space 
we follow the scheme outlined for the ID case and perform a linear weighted average of 
all available statistics from simulations with different R^^ and Rqi- A free energy surface 
is computed numerically on a grid by minimizing the sum of squared deviations between 
its partial derivatives, represented by finite differences, and their numerical values obtained 
from simulations. 

The result of this computation for F{Rcm, Rgi) is shown in Fig. M^a). The narrow trench 
that runs along the Rem axis indicates the collapse transition from state 1 to 0, whereas 
the wide diagonal trench corresponds to the "unfolding" transition from state to 1*. The 
accuracy of F{Rcmi Rgi) is addressed in auxiliary material [3| where we examine the con- 
formational transition between states 1 and 1*. Figure EJ^b) shows the contour plot of 
R{Rcm, Rgi) together with three representative trajectories for unconstrained {ki = /c2 = 0) 
polymer collapse from state 2. Trajectory A in Fig.[2|^b) follows the typical collapse pathway 
2 ^ 1 ^ 0. However, 683 out of the total of 10000 simulated collapse trajectories never 
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enter the 1 — trench but instead reverse the "unfolding" pathway, similarly to trajectory 
C in Fig. mh). 

When collapse proceeds along its typical path, a nonequilibrium effect is occasionally 
observed at the point when blobs 1-36 and 55-100 come into contact. As a result of vigorous 
mixing caused by the collision of the two blobs, monomers 27-36 may break from blob 1- 
36 and, subsequently, may either return to the same blob, as shown by trajectory B in 
Fig. [2t^b), or join with blob 55-100, which appears as a sudden transition from state 1 
to 1* (not shown). Aside from this local nonequilibrium effect, we ask if the polymer is 
generally in quasi-equilibrium during its collapse, as assumed in diffusive model. To address 
this issue, in Fig. El^a) we compare mean "equilibrium" and "nonequilibrium" times teq and 
tncq of unimpeded polymer collapse from different initial positions in the 1^0 trench. To 
compute icq, we have run extensive simulations of the polymer bound to various positions 
in the trench by harmonic constraints Mhi, Mh25 that are lifted at regular times to initiate 
the total of 180000 independent collapse trajectories, tneq are derived from other 10000 
trajectories which originate in state 2 and traverse the 1^0 trench in passing. The ratio 
tncq/ tcq in Fig. [3](a) shows a pronounced peak in the transition region between states 1 and 
0, indicating delay in collapse caused by the local nonequilibrium effect described above. 
Furthermore, tneq/^q increases monotonically as -Rem 2.5a, which suggests that the freely 
collapsing polymer fails to achieve local equilibrium as it traverses steep portions of the 
reaction pathway. 

We also find evidence that different portions of the free energy surface manifest different 
effective diffusion constants. Equilibrium times teq can be fitted to the 2D diffusive model 
with the diffusion coefficients Di = 6.4 x 10~^k-QT'~f~^ for Rqi and the position-dependent D2 
for -Rem, shown in Fig. ^h) 12|. Note that D2 decreases by an order of magnitude when the 
polymer enters the free energy basin of its collapsed state, in agreement with the previous 
work on two-state protein folding where a similar slowdown in diffusion is attributed to the 
increasing polymer compactness 6|]. In experiments 13|] on the folding kinetics of protein 
L, a ten-fold slowdown in diffusion is observed after the initial hydrophobic collapse and 
attributed to the ruggedness of the energy landscape in the compact unfolded state. In the 
absence of a folding barrier, slow diffusion in the compact unfolded state may contribute 
significantly to the overall folding time, so that details of the energy landscape in this state 
may become very important. In the present case, a planar cut of the free energy surface 
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FIG. 3: (a) Collapse times icq (squares) and tncq (circles) from different initial positions in the 
1^0 trench. The solid curve is the fit of teq with Di = 6.4 x 10~^, describing lateral vibrations 
of collapse trajectories in the trench, and D2, shown as dash curve in (b). Triangles (right axis) 
are tneq/^eq- In (b), the solid curve is the planar cut Rqi = 2a of the 2D free energy surface. The 
units are k-^T for F, a for Rem and k-QT'^~^ for Di, D2- 

in Fig. [3](b) indicates that the depth of the ground state basin is ~ too large to 

be computed from irreversible collapse trajectories. Estimating the drag coefficient 7 from 
Stokes' formula, we find the mean diffusion coefficient D2 to be in good agreement with 



the value obtained from experiments on hydrophobic collapse of proteins This is not 



surprising since hydrophobic collapse of many heteropolymers involves diffusion and mutual 



aggregation of locally collapsed structures ISj , the same elementary processes that take part 
in the 1 — > transition described by D2- 

In closing, we can draw some general conclusions from the behavior of this relatively 
simple HP model. Most applications of the diffusive model to protein folding rely on the 
assumptions that i) the protein is in local equilibrium during its folding, ii) the energy 
landscape can be proiected onto a single reaction coordinate, and Hi) the effective diffusion 

riQ 

coefficient is constant, with noted exceptions [6|, 17|. We have presented an efficient method 
for the calculation of multi-dimensional free energy surfaces that has allowed us to determine 
the validity of these assumptions when applied to hydrophobic collapse as a basic model for 
downhill protein folding. We find that polymer collapse generally adheres to the equilibrium 
energy landscape, although one cannot neglect nonequilibrium effects which, in the present 
case, noticeably delay collapse or cause it to take an altered pathway. Nonequilibrium 
features are not captured in equilibrium free energy calculations and can only be revealed 
by additional analysis of collapse trajectories. At the same time, nonequilibrium effects bias 
the free energy estimates that are obtained from collapse/folding trajectories. Irreversible 
conformational changes, associated with folding, also interfere with accurate determination 
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of free e nerg ies from folding trajectories. Furthermore we find, in agreement with previous 
work j^, [l3j . that diffusion slows down by an order of magnitude in the collapsed state, 
resulting from the increasing polymer compactness. In the case of downhill protein folding, 
this slow diffusion in the collapsed state may contribute significantly to the overall folding 
time. Finally, as illustrated by trajectory A in Fig. [2](b), fluctuations in the final state are 
not necessarily described by the same reaction coordinate as collapse to this state, so that 
projecting trajectories onto a single reaction coordinate may turn out to be inaccurate even 
in the case of a single apparent pathway. In the present study two reaction coordinates have 
proved sufficient to build a thermodynamically and kinetically consistent diffusive model 
with two distinct, physically meaningful diffusion coefficients. 

This work was supported by NSF through Grant No. CHE-0517818 and through TeraGrid 
Grant No. TG-CHE070075N. 
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